AITopics | hs 1

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

ec1f764517b7ffb52057af6df18142b7-Supplemental.pdf

Neural Information Processing Systems Feb-11-2026, 18:17:46 GMT

The events in task $s$ are denoted as $H_s = H_{s,n+1}$, and all the events up to and including stage $s$ are denoted as $H_{1:s} = \bigcup_{s'=1}^{s} H_{s'}$. Let X denote that the proportionality constant is independent of $X$ (possibly a set). Using the relation between the history-independent and history-dependent entropy terms we obtain the second inequality. The right side term in the first inequality holds due to Cauchy-Schwarz. The inequality holds due to Weyl's inequality. This is true by using the upper bounds on $\sigma^2_{\max}(\hat{\Sigma}_{s,t})$ in Lemma 3, and because the function $\sqrt{x/\log(1+ax)}$ for $a>0$ increases with $x$.

artificial intelligence, hs 1, information, (15 more...)

Technology: Information Technology > Artificial Intelligence (0.46)

Entropic bounds for conditionally Gaussian vectors and applications to neural networks

Celli, Lucia, Peccati, Giovanni

arXiv.org Machine Learning Apr-11-2025

Using entropic inequalities from information theory, we provide new bounds on the total variation and 2-Wasserstein distances between a conditionally Gaussian law and a Gaussian law with invertible covariance matrix. We apply our results to quantify the speed of convergence to Gaussian of a randomly initialized fully connected neural network and its derivatives, evaluated at a finite number of inputs, when the initialization is Gaussian and the sizes of the inner layers diverge to infinity. Our results require mild assumptions on the activation function, and allow one to recover optimal rates of convergence in a variety of distances, thus improving and extending the findings of Basteri and Trevisan (2023), Favaro et al. (2023), Trevisan (2024) and Apollonio et al. (2024). One of our main tools is the set of quantitative cumulant estimates established in Hanin (2024). As an illustration, we apply our results to bound the total variation distance between the Bayesian posterior law of the neural network and its derivatives, and the posterior law of the corresponding Gaussian limit: this yields quantitative versions of a posterior CLT by Hron et al. (2022), and extends several estimates by Trevisan (2024) to the total variation metric.

artificial intelligence, machine learning, nullk 1, (17 more...)

2504.08335

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.74)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Towards Secure Tuning: Mitigating Security Risks Arising from Benign Instruction Fine-Tuning

Du, Yanrui, Zhao, Sendong, Cao, Jiawei, Ma, Ming, Zhao, Danyang, Fan, Fenglei, Liu, Ting, Qin, Bing

arXiv.org Artificial Intelligence Oct-6-2024

Instruction Fine-Tuning (IFT) has become an essential method for adapting base Large Language Models (LLMs) into variants for professional and private use. However, researchers have raised concerns over a significant decrease in LLMs' security following IFT, even when the IFT process involves entirely benign instructions (termed Benign IFT). Our study represents a pioneering effort to mitigate the security risks arising from Benign IFT. Specifically, we conduct a Module Robustness Analysis, aiming to investigate how LLMs' internal modules contribute to their security. Based on our analysis, we propose a novel IFT strategy, called the Modular Layer-wise Learning Rate (ML-LR) strategy. In our analysis, we implement a simple security feature classifier that serves as a proxy to measure the robustness of modules (e.g., $Q$/$K$/$V$). Our findings reveal that the module robustness shows clear patterns, varying regularly with the module type and the layer depth. Leveraging these insights, we develop a proxy-guided search algorithm to identify a robust subset of modules, termed Mods$_{Robust}$. During IFT, the ML-LR strategy employs differentiated learning rates for Mods$_{Robust}$ and the remaining modules. Our experimental results show that in security assessments, the application of our ML-LR strategy significantly mitigates the rise in harmfulness of LLMs following Benign IFT. Notably, our ML-LR strategy has little impact on the usability or expertise of LLMs following Benign IFT. Furthermore, we have conducted comprehensive analyses to verify the soundness and flexibility of our ML-LR strategy.
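
The idea of giving one subset of modules a different learning rate from the others can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the function name `build_param_groups`, the module names, and both learning-rate values are hypothetical, and the abstract does not say which group receives the larger rate, so both rates are left as parameters.

```python
from typing import Dict, Iterable, List, Set


def build_param_groups(
    module_names: Iterable[str],
    robust_modules: Set[str],
    robust_lr: float,
    default_lr: float,
) -> List[Dict[str, object]]:
    """Split module names into two groups, each tagged with its own learning rate.

    The returned structure mirrors the "parameter group" layout that many
    optimizers accept, but it holds names only (no tensors), so it runs
    without any deep-learning library.
    """
    robust: List[str] = []
    rest: List[str] = []
    for name in module_names:
        # Route each module by membership in the (hypothetical) robust subset.
        if name in robust_modules:
            robust.append(name)
        else:
            rest.append(name)
    return [
        {"modules": robust, "lr": robust_lr},
        {"modules": rest, "lr": default_lr},
    ]


# Hypothetical module names and rates, chosen only for illustration.
names = ["layers.0.q_proj", "layers.0.k_proj", "layers.1.q_proj", "layers.1.v_proj"]
mods_robust = {"layers.0.q_proj", "layers.1.v_proj"}
groups = build_param_groups(names, mods_robust, robust_lr=1e-5, default_lr=1e-6)
for group in groups:
    print(group["lr"], group["modules"])
```

In a real fine-tuning setup the name lists would be replaced by the corresponding parameter tensors before being handed to an optimizer; how the robust subset is found (the proxy-guided search) is outside the scope of this sketch.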

arxiv preprint arxiv, llm, module, (14 more...)

2410.04524

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Law Enforcement & Public Safety (0.85)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)

Estimation of Riemannian distances between covariance operators and Gaussian processes

Minh, Ha Quang

arXiv.org Machine Learning Aug-26-2021

In this work we study two Riemannian distances between infinite-dimensional positive definite Hilbert-Schmidt operators, namely affine-invariant Riemannian and Log-Hilbert-Schmidt distances, in the context of covariance operators associated with functional stochastic processes, in particular Gaussian processes. Our first main results show that both distances converge in the Hilbert-Schmidt norm. Using concentration results for Hilbert space-valued random variables, we then show that both distances can be consistently and efficiently estimated from (i) sample covariance operators, (ii) finite, normalized covariance matrices, and (iii) finite samples generated by the given processes, all with dimension-independent convergence. Our theoretical analysis exploits extensively the methodology of reproducing kernel Hilbert space (RKHS) covariance and cross-covariance operators. The theoretical formulation is illustrated with numerical experiments on covariance operators of Gaussian processes.
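
For orientation, the finite-dimensional counterparts of the two distances named above are commonly written as follows for symmetric positive definite matrices $A$ and $B$. The symbols $d_{\mathrm{aiE}}$ and $d_{\mathrm{logE}}$ are notation assumed here, not taken from the abstract, and the infinite-dimensional versions studied in the paper are defined on operators rather than matrices and are not reproduced.

```latex
% Affine-invariant Riemannian distance between SPD matrices A and B
d_{\mathrm{aiE}}(A, B) = \left\| \log\!\left( A^{-1/2} B A^{-1/2} \right) \right\|_F

% Log-Euclidean distance (finite-dimensional analogue of Log-Hilbert-Schmidt)
d_{\mathrm{logE}}(A, B) = \left\| \log A - \log B \right\|_F
```

Here $\log$ is the principal matrix logarithm and $\|\cdot\|_F$ is the Frobenius norm, the finite-dimensional counterpart of the Hilbert-Schmidt norm.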

covariance operator, operator, sym, (16 more...)

2108.11683

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Michigan (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Modeling & Simulation (0.82)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.34)
